Explaining the hierarchy of visual representational geometries by remixing of features from many computational vision models
Authors
Abstract
Visual processing in cortex happens through a hierarchy of increasingly sophisticated representations. Here we explore a very wide range of model representations (29 models), testing their categorization performance (animate/inanimate) and their ability to account for the representational geometry of brain regions along the visual hierarchy (V1, V2, V3, V4, and LO). We also created new model instantiations (85 model instantiations in total) by reweighting and remixing the model features. Reweighting and remixing were based on brain responses to an independent training set of 1750 images. We assessed the models with representational similarity analysis (RSA), which characterizes the geometry of a representation by a representational dissimilarity matrix (RDM). In this study, the RDM is computed either on the basis of the model features or on the basis of predicted voxel responses. Voxel responses are predicted by linear combinations of the model features. The model features are linearly remixed so as to best explain the voxel responses (as in voxel/population receptive-field modelling). This new approach of combining RSA with voxel receptive-field modelling may help bridge the gap between the two methods. We found that early visual areas are best accounted for by a Gabor wavelet pyramid (GWP) model. The GWP implementations we used performed similarly with and without remixing, suggesting that the original features already approximate the representational space, obviating the need for remixing or reweighting. The lateral occipital region (LO), a higher visual representation, was best explained by the higher layers of a deep convolutional network (Krizhevsky et al., 2012). However, this model could explain the LO representation only after appropriate remixing of its feature set.
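The contrast between an RDM computed directly from model features and an RDM computed from predicted voxel responses can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the ridge penalty, the correlation-distance RDM, the Spearman-style comparison, the array sizes, and all function names (`rdm`, `fit_remixing`, `rdm_similarity`, `run_demo`) are assumptions made for the example.

```python
import numpy as np


def rdm(patterns):
    """Correlation-distance RDM: 1 - Pearson r between rows (one row per stimulus)."""
    return 1.0 - np.corrcoef(patterns)


def fit_remixing(features_train, voxels_train, ridge=1.0):
    """Ridge-regularised linear map from model features to voxel responses.

    Returns a (n_features, n_voxels) weight matrix. The ridge penalty is an
    assumption of this sketch, chosen only to keep the solve well conditioned.
    """
    n_features = features_train.shape[1]
    gram = features_train.T @ features_train + ridge * np.eye(n_features)
    return np.linalg.solve(gram, features_train.T @ voxels_train)


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix as a vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def rdm_similarity(rdm_a, rdm_b):
    """Rank correlation between two RDMs (Spearman-style, ignoring ties)."""
    ranks_a = upper_triangle(rdm_a).argsort().argsort()
    ranks_b = upper_triangle(rdm_b).argsort().argsort()
    return np.corrcoef(ranks_a, ranks_b)[0, 1]


def run_demo(seed=0):
    """Compare unmixed and remixed model RDMs against a synthetic 'brain' RDM."""
    rng = np.random.default_rng(seed)
    n_train, n_test, n_features, n_voxels = 200, 40, 30, 50

    # Synthetic "brain": voxels are a linear remix of the model features in
    # which some features count far more than others, plus measurement noise.
    gains = np.linspace(0.2, 3.0, n_features)
    true_weights = gains[:, None] * rng.standard_normal((n_features, n_voxels))

    features_train = rng.standard_normal((n_train, n_features))
    features_test = rng.standard_normal((n_test, n_features))
    voxels_train = features_train @ true_weights + rng.standard_normal((n_train, n_voxels))
    voxels_test = features_test @ true_weights + rng.standard_normal((n_test, n_voxels))

    brain_rdm = rdm(voxels_test)

    # Unmixed: RDM computed directly from the model features.
    fixed_score = rdm_similarity(rdm(features_test), brain_rdm)

    # Remixed: fit the linear map on the training set, predict voxel
    # responses for the held-out test stimuli, then compute the RDM.
    weights = fit_remixing(features_train, voxels_train)
    mixed_score = rdm_similarity(rdm(features_test @ weights), brain_rdm)
    return fixed_score, mixed_score


fixed_score, mixed_score = run_demo()
print(f"unmixed features vs brain RDM: {fixed_score:.3f}")
print(f"remixed features vs brain RDM: {mixed_score:.3f}")
```

Because the synthetic voxels are constructed as a linear remix of the features, the remixed RDM should track the brain RDM more closely than the unmixed one here; with real data the gain from remixing depends on the model and brain region, as the abstract reports for GWP versus the deep network.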
Remixed RSA takes a step in an important direction, where each computational model representation is explored more broadly by considering not only its representational geometry, but the set of all geometries within reach of a linear transform. The exploration of many models and many brain areas may lead to a better understanding of the processing stages in the visual hierarchy, from low-level image representations in V1 to visuo-semantic representations in higher-level visual areas.
bioRxiv preprint first posted online Oct. 3, 2014; doi: http://dx.doi.org/10.1101/009936. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Similar resources
Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation
Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in to...
Fixed versus mixed RSA: Explaining visual representations by fixed and mixed feature sets from shallow and deep computational models
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and...
Representation of similarity as a goal of early visual processing
We consider the representational capabilities of systems of receptive fields found in early mammalian vision, under the assumption that the successive stages of processing remap the retinal representation space in a manner that makes objectively similar stimuli (such as different views of the same 3D object) closer to each other, and dissimilar stimuli farther apart. We present theoretical analysi...
Explaining the Level of Human Thought in the Parallel Civilizations Based on Formal Structure and Visual Imagination Formed in Mythical Narratives
Myth, like any other form of narrative, has an undeniable role in visual imagination based on the foundations of mythical thought. Ernst Cassirer, by recovering the fundamental principles of mythical thought, sets them against the foundations of contemporary rational thought and defines the fundamental features of mythical thought as compared to modern rational thought. He also believes t...
Deep neural networks: a new framework for modelling biological vision and brain information processing
Recent advances in neural network modelling have enabled major strides in computer vision and other artificial intelligence applications. Human-level visual recognition abilities are coming within reach of artificial systems. Artificial neural networks are inspired by the brain and their computations could be implemented in biological neurons. Convolutional feedforward networks, which now domin...